
    Teaching the Old Dog New Tricks: Supervised Learning with Constraints

    Adding constraint support in Machine Learning has the potential to address outstanding issues in data-driven AI systems, such as safety and fairness. Existing approaches typically apply constrained optimization techniques to ML training, enforce constraint satisfaction by adjusting the model design, or use constraints to correct the output. Here, we investigate a different, complementary strategy based on "teaching" constraint satisfaction to a supervised ML method via the direct use of a state-of-the-art constraint solver: this enables taking advantage of decades of research on constrained optimization with limited effort. In practice, we use a decomposition scheme alternating master steps (in charge of enforcing the constraints) and learner steps (where any supervised ML model and training algorithm can be employed). The process leads to approximate constraint satisfaction in general, and convergence properties are difficult to establish; despite this, we found empirically that even a naïve setup of our approach performs well on ML tasks with fairness constraints and on classical datasets with synthetic constraints.
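    The alternating scheme is easy to prototype. Below is a minimal sketch, assuming a deliberately toy setup of our own: a least-squares learner and a master step that projects predictions onto a single affine constraint (an upper bound on the mean prediction). The paper's actual master step invokes a full constraint solver; all names here are illustrative.

    ```python
    # Minimal sketch of the master/learner decomposition (illustrative setup).
    import numpy as np

    def master_step(z, bound=0.5):
        # Project targets onto the constraint set {z : mean(z) <= bound};
        # for this affine constraint the projection is a uniform shift.
        excess = z.mean() - bound
        return z - excess if excess > 0 else z

    def learner_step(X, z):
        # Any supervised learner works here; ordinary least squares for brevity.
        w, *_ = np.linalg.lstsq(X, z, rcond=None)
        return w

    def train(X, y, iters=20):
        z = y.copy()
        for _ in range(iters):
            w = learner_step(X, z)      # learner step: fit current targets
            z = master_step(X @ w)      # master step: enforce the constraint
        return w

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)
    w = train(X, y)
    print("mean prediction:", (X @ w).mean())  # approximately meets the bound
    ```

    Consistent with the abstract, the loop yields approximate rather than exact satisfaction: the final learner step is not re-projected, so the constraint holds only up to a small residual.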

    Modeling and Analysis of Fault Tolerant Multistage Interconnection Networks

    Performance and reliability are two of the most crucial issues in today's high-performance instrumentation and measurement systems. High-speed, compact-density multistage interconnection networks (MINs) are widely used subsystems in different applications. New performance models are proposed to evaluate a novel fault tolerant MIN arrangement, thereby assuring performance and reliability with a high confidence level. A concurrent fault detection and recovery scheme for MINs is considered by rerouting over redundant interconnection links under stringent real-time constraints for digital instrumentation such as sensor networks. A switch architecture for concurrent testing and diagnosis is proposed. New performance models are developed and used to evaluate the compound effect of fault tolerant operation (inclusive of testing, diagnosis, and recovery) on the overall throughput and delay. Results are shown for single transient and permanent stuck-at faults on links and storage units in the switching elements. It is shown that performance degradation due to fault tolerance is graceful, while performance degradation without fault recovery is unacceptable.
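    As a rough illustration of the effect such performance models quantify, the following Monte Carlo sketch (our own construction, not the paper's analytical model) compares packet delivery through a multistage network when a faulty link is rerouted over a redundant link versus left unrecovered.

    ```python
    # Toy Monte Carlo comparison of delivery rates with and without recovery.
    import random

    def delivery_rate(stages=4, p_fault=0.05, redundant=True, trials=100_000):
        delivered = 0
        for _ in range(trials):
            ok = True
            for _ in range(stages):
                if random.random() < p_fault:  # primary link is faulty
                    # With a redundant link the packet is rerouted and
                    # survives unless the spare link is also faulty.
                    if not (redundant and random.random() >= p_fault):
                        ok = False
                        break
            delivered += ok
        return delivered / trials

    print("with recovery   :", delivery_rate(redundant=True))
    print("without recovery:", delivery_rate(redundant=False))
    ```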

    Codes for Limited Magnitude Error Correction in Multilevel Cell Memories

    Multilevel cell (MLC) memories have been advocated for increasing density at low cost in next-generation memories. However, storing several bits in a cell reduces the distance between levels; this reduced margin makes such memories more vulnerable to defective phenomena and parameter variations, leading to errors in the stored data. These errors are typically of limited magnitude, because the induced change causes the stored value to exceed only a few of the level boundaries. To protect these memories from such errors and ensure that the stored data is not corrupted, Error Correction Codes (ECCs) are commonly used. However, most existing codes have been designed to protect memories in which each cell stores a single bit and are thus not efficient for protecting MLC memories. In this paper, an efficient scheme that can correct up to magnitude-3 errors is presented and evaluated. The scheme is based on combining ECCs that are commonly used to protect traditional memories. In particular, Interleaved Parity (IP) bits and Single Error Correction and Double Adjacent Error Correction (SEC-DAEC) codes are utilized; both codes are combined in the proposed IP-DAEC scheme to efficiently provide a strong coding function for correction, thus exceeding the capabilities of most existing coding schemes for limited magnitude errors. The SEC-DAEC code is used to detect the cell in error and correct some bits, while the IP bits identify the remaining erroneous bits in the memory cell. The use of these simple codes results in an efficient implementation of the decoder compared to existing techniques, as shown by the evaluation results presented in this paper. The proposed scheme is also competitive in terms of the number of parity check bits and memory redundancy. Therefore, the proposed IP-DAEC scheme is a very efficient alternative for protecting MLC memories from limited magnitude errors. Pedro Reviriego was partially supported by the TEXEO project (TEC2016-80339-R), funded by the Spanish Research Plan, and by the Madrid Community research project TAPIR-CM, grant no. P2018/TCS-4496.
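    To make the combination concrete, the following toy sketch, under our own simplified framing, shows the interleaved parity ingredient: separate parity over even- and odd-indexed bits exposes a double-adjacent error such as one caused by a limited-magnitude level shift. The actual SEC-DAEC parity-check matrix used in the paper is not reproduced here.

    ```python
    # Interleaved parity over even- and odd-indexed bits (illustrative only).
    def interleaved_parity(bits):
        even = sum(bits[0::2]) % 2  # parity of even-indexed bits
        odd = sum(bits[1::2]) % 2   # parity of odd-indexed bits
        return even, odd

    data = [1, 0, 1, 1, 0, 1, 0, 0]

    # A double-adjacent error flips two neighbouring bits, one even-indexed
    # and one odd-indexed, so BOTH interleaved parities change and the error
    # pattern is exposed rather than masked.
    corrupted = data.copy()
    corrupted[3] ^= 1
    corrupted[4] ^= 1

    print(interleaved_parity(data))       # parities of the clean word
    print(interleaved_parity(corrupted))  # both parities flip -> detectable
    ```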

    Introduction to the Special Section on Nano Systems and Computing

    It is with great pleasure that we introduce the special section on Nano Systems and Computing to the readership of the IEEE Transactions on Computers. This special section consists of five papers that have been selected to cover a wide spectrum of techniques encountered in the design of nano-scale computing systems.

    Selective Neuron Re-Computation (SNRC) for Error-Tolerant Neural Networks

    Artificial neural networks (ANNs) are widely used to solve classification problems in many machine learning applications. When errors occur in the computational units of an ANN implementation, due to radiation effects for example, the result of an arithmetic operation can be changed and the predicted classification class may be erroneously affected. This is not acceptable when ANNs are used in safety-critical applications, because an incorrect classification may result in a system failure. Existing error-tolerant techniques usually rely on physically replicating parts of the ANN implementation or incur a significant computation overhead. Therefore, efficient protection schemes are needed for ANNs that run on a processor in resource-limited platforms. A technique referred to as Selective Neuron Re-Computation (SNRC) is proposed in this paper. Based on the ANN structure and algorithmic properties, SNRC can identify the cases in which errors have no impact on the outcome; therefore, errors only need to be handled by re-computation when the classification result is detected as unreliable. Compared with existing temporal redundancy-based protection schemes, SNRC saves more than 60 percent of the re-computation overhead (more than 90 percent in many cases) needed to achieve complete error protection, as assessed over a wide range of datasets. Different activation functions are also evaluated. This research was supported by the National Science Foundation Grants CCF-1953961 and 1812467, by the ACHILLES project PID2019-104207RB-I00 and the Go2Edge network RED2018-102585-T funded by the Spanish Ministry of Science and Innovation, and by the Madrid Community research project TAPIR-CM P2018/TCS-4496.
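    The underlying idea can be sketched with an illustrative margin-based criterion of our own (not the exact SNRC condition): if the gap between the top two output scores exceeds the worst-case perturbation an arithmetic error could introduce, the predicted class cannot change and re-computation is skipped.

    ```python
    # Skip re-computation when the classification margin makes it unnecessary.
    import numpy as np

    def classify_with_selective_recompute(logits, error_bound, recompute):
        top2 = np.sort(logits)[-2:]
        margin = top2[1] - top2[0]
        if margin > 2 * error_bound:
            # An error of at most error_bound per score cannot swap the
            # top two classes, so the original result is kept.
            return int(np.argmax(logits))
        return int(np.argmax(recompute()))  # unreliable: re-compute

    logits = np.array([0.1, 2.7, 2.6, -1.0])
    result = classify_with_selective_recompute(
        logits, error_bound=0.2,
        recompute=lambda: np.array([0.1, 2.7, 2.6, -1.0]))
    print(result)  # margin 0.1 <= 0.4, so this input triggers re-computation
    ```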

    Evaluating the Repair of System-on-Chip (SoC) using Connectivity

    This paper presents a new model for analyzing the repairability of reconfigurable system-on-chip (RSoC) instrumentation under a repair process. It exploits the connectivity of the interconnected cores, taking into account unreliability factors due to both neighboring cores and the interconnect structure. Based on connectivity, two RSoC repair scheduling strategies, Minimum Number of Interconnections First (I-MIN) and Minimum Number of Neighboring Cores First (C-MIN), are proposed. Two other scheduling strategies, Maximum Number of Interconnections First (I-MAX) and Maximum Number of Neighboring Cores First (C-MAX), are also introduced and analyzed to further explore the impact of connectivity-based repair scheduling on the overall repairability of RSoCs. Extensive parametric simulations demonstrate the efficiency of the proposed repair scheduling strategies, showing that highly reliable RSoC instrumentation can ultimately be manufactured.
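    The four orderings reduce to simple sorts over per-core connectivity attributes, as in this minimal sketch (field names are ours; the paper's reliability model is not reproduced).

    ```python
    # The four connectivity-based repair orderings as sorts (illustrative).
    cores = [
        {"id": "A", "interconnections": 5, "neighbors": 3},
        {"id": "B", "interconnections": 2, "neighbors": 4},
        {"id": "C", "interconnections": 7, "neighbors": 1},
    ]

    schedules = {
        "I-MIN": sorted(cores, key=lambda c: c["interconnections"]),
        "I-MAX": sorted(cores, key=lambda c: -c["interconnections"]),
        "C-MIN": sorted(cores, key=lambda c: c["neighbors"]),
        "C-MAX": sorted(cores, key=lambda c: -c["neighbors"]),
    }

    for name, order in schedules.items():
        print(name, [c["id"] for c in order])
    ```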

    Testing Layered Interconnection Networks

    We present an approach for fault detection in layered interconnection networks (LINs). An LIN is a generalized multistage interconnection network commonly used in reconfigurable systems; the nets (links) are arranged in sets (referred to as layers) of different sizes. Switching elements (made of simple switches such as transmission-gate-like devices) are arranged in a cascade to connect pairs of layers. The switching elements of an LIN have the same number of switches, but the switching patterns may not be uniform. A comprehensive fault model for the nets and switches is assumed at the physical and behavioral levels. Testing requires configuring the LIN multiple times. Using a graph approach, it is proven that the minimal set of configurations corresponds to the node-disjoint path sets. The proposed approach is based on two novel results in the execution of the network flow algorithm to find node-disjoint path sets while retaining optimality in the number of configurations. These objectives are accomplished by finding a feasible flow such that the maximal degree can be iteratively decreased, while guaranteeing the existence of an appropriate circulation. Net adjacencies are also tested for possible bridge faults (shorts). To achieve 100 percent coverage of bridge faults, a post-processing algorithm may be required; bounds on its complexity are provided. The execution complexity of the proposed approach (inclusive of test vector generation and post-processing) is O(N⁴WL), where N is the total number of nets, W is the number of switches per switching element, and L is the number of layers. Extensive simulation results are provided.
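    Finding node-disjoint path sets by network flow follows the standard node-splitting reduction, sketched below with an off-the-shelf max-flow routine; the paper's specific refinements, which iteratively decrease the maximal degree while preserving a feasible circulation, are not reproduced.

    ```python
    # Node-disjoint paths via max flow with unit-capacity node splitting.
    import networkx as nx

    def node_disjoint_paths(edges, source, sink):
        G = nx.DiGraph()
        nodes = {u for e in edges for u in e}
        for v in nodes:
            # Split each net v into v_in -> v_out; unit capacity ensures no
            # two paths share the net (source/sink are left unconstrained).
            cap = float("inf") if v in (source, sink) else 1
            G.add_edge((v, "in"), (v, "out"), capacity=cap)
        for u, v in edges:
            G.add_edge((u, "out"), (v, "in"), capacity=1)
        value, _ = nx.maximum_flow(G, (source, "in"), (sink, "out"))
        return value

    # Tiny layered example: two node-disjoint routes from s to t.
    edges = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t"), ("a", "b")]
    print(node_disjoint_paths(edges, "s", "t"))  # -> 2
    ```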

    Probabilistic Balancing of Fault Coverage and Test Cost in Combined Built-In Self-Test/Automated Test Equipment Testing Environment

    As the design and test complexities of SoCs continue to intensify, the balanced utilization of combined built-in self-test (BIST) and automated test equipment (ATE) testing becomes desirable to meet the required minimum fault coverage while maintaining an acceptable cost overhead. The cost associated with combined BIST/ATE testing of such systems mainly consists of 1) the cost induced by the BIST area overhead and 2) the cost induced by the overall testing time. In general, BIST is significantly faster than ATE but provides only limited fault coverage, and achieving higher fault coverage with BIST requires additional area cost. On the other hand, higher fault coverage can be easily achieved with ATE, but excessive use of ATE results in additional test time. This paper proposes a novel probabilistic method to balance the fault coverage and test overhead costs in a combined BIST/ATE test environment. The proposed technique is then applied to two BIST/ATE test scenarios to find the optimal fault-coverage/cost combinations.
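    A toy grid search illustrates the trade-off, under cost models that are our own assumptions (convexly growing BIST area cost and ATE test time that grows steeply near full coverage) rather than the paper's probabilistic formulation.

    ```python
    # Grid search over the BIST/ATE coverage split (illustrative cost models).
    import numpy as np

    TARGET = 0.99        # required overall fault coverage
    AREA_WEIGHT = 4.0    # cost weight of BIST area overhead
    TIME_WEIGHT = 1.0    # cost weight of ATE test time

    def total_cost(c_bist):
        # Fraction of the remaining faults that ATE must cover to hit TARGET.
        c_ate = (TARGET - c_bist) / (1 - c_bist)
        area_cost = AREA_WEIGHT * c_bist ** 2          # convex area growth
        time_cost = TIME_WEIGHT * c_ate / (1 - c_ate)  # steep near full coverage
        return area_cost + time_cost

    grid = np.linspace(0.0, TARGET - 1e-3, 500)
    best = grid[np.argmin([total_cost(c) for c in grid])]
    print(f"illustrative optimum: BIST coverage ~ {best:.2f}")
    ```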